Identification of Probabilities of Languages
Abstract
We consider the problem of inferring the probability distribution associated with a language, given data consisting of an infinite sequence of elements of the language. We do this under two assumptions on the algorithms concerned: (i) like a real-life algorithm, it has round-off errors, and (ii) it has no round-off errors. Under assumption (i) we treat three cases. First, (a) we consider a probability mass function over the elements of the language when the data are drawn independently and identically distributed (i.i.d.), provided the probability mass function is computable and has a finite expectation. Using the Strong Law of Large Numbers, we give an effective procedure that almost surely identifies the target probability mass function in the limit. Second, (b) we treat the case of possibly incomputable probability mass functions in the same setting. In this case we can only converge pointwise to the target probability mass function almost surely. Third, (c) we consider the case where the data are dependent, assuming they are typical for at least one computable measure and the language is finite. There is an effective procedure to identify by infinite recurrence a nonempty subset of the computable measures according to which the data are typical. Here we use the theory of Kolmogorov complexity. Under assumption (ii) we obtain the weaker result for (a) that the target distribution is identified by infinite recurrence almost surely; (b) stays the same as under assumption (i). We also consider the associated predictions.
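The pointwise convergence mentioned in case (b) can be illustrated with empirical frequencies: for i.i.d. data, the Strong Law of Large Numbers implies that the relative frequency of each element converges almost surely to its probability. The sketch below is only an illustration of that fact, not the identification procedure of the paper. The three-word language, the `target` distribution, and the function names `empirical_pmf` and `max_error` are all assumptions made up for this example.

```python
import random
from collections import Counter


def empirical_pmf(samples):
    """Relative frequency of each element in a finite prefix of the data."""
    counts = Counter(samples)
    n = len(samples)
    return {x: c / n for x, c in counts.items()}


def max_error(estimate, target):
    """Largest pointwise gap between the estimate and the target pmf."""
    return max(abs(estimate.get(x, 0.0) - p) for x, p in target.items())


# Hypothetical target pmf over a tiny language (illustration only).
target = {"a": 0.5, "ab": 0.3, "abb": 0.2}
words = list(target)
weights = [target[w] for w in words]

rng = random.Random(0)  # fixed seed so the run is repeatable

if __name__ == "__main__":
    # Longer prefixes of the i.i.d. sequence should give smaller pointwise error.
    for n in (100, 10_000, 100_000):
        prefix = rng.choices(words, weights=weights, k=n)
        print(n, max_error(empirical_pmf(prefix), target))
```

The printed error typically shrinks as the prefix grows, which is the pointwise behavior described above. Identifying a computable target exactly in the limit, as in case (a), requires the additional effective procedure that the abstract refers to and that this sketch does not attempt.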